AITopics | Almaty

Almaty

Overspecified Mixture Discriminant Analysis: Exponential Convergence, Statistical Guarantees, and Remote Sensing Applications

Bolatov, Arman, Legg, Alan, Melnykov, Igor, Nurlanuly, Amantay, Tezekbayev, Maxat, Assylbekov, Zhenisbek

arXiv.org Machine Learning, Nov-3-2025

This study explores the classification error of Mixture Discriminant Analysis (MDA) in scenarios where the number of mixture components exceeds the number present in the actual data distribution, a condition known as overspecification. We use a two-component Gaussian mixture model within each class to fit data generated from a single Gaussian, analyzing both the algorithmic convergence of the Expectation-Maximization (EM) algorithm and the statistical classification error. We demonstrate that, with suitable initialization, the EM algorithm converges exponentially fast to the Bayes risk at the population level. Further, we extend our results to finite samples, showing that the classification error converges to the Bayes risk at a rate of $n^{-1/2}$ under mild conditions on the initial parameter estimates and sample size. This work provides a rigorous theoretical framework for understanding the performance of overspecified MDA, which is often used empirically in complex data settings, such as image and text classification. To validate our theory, we conduct experiments on remote sensing datasets.
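As a rough illustration of the overspecified setting described in this abstract, the sketch below fits a two-component mixture to data drawn from a single Gaussian. It is a deliberately simplified, one-dimensional stand-in and not the authors' model: the symmetric form 0.5·N(θ, 1) + 0.5·N(−θ, 1), the known unit variance, the function name `em_symmetric_mixture`, the sample size, and the starting value are all assumptions made for illustration. It makes no claim about the convergence rates analyzed in the paper.

```python
import math
import random


def em_symmetric_mixture(xs, theta0, n_iters=50):
    """EM for the mixture 0.5*N(theta, 1) + 0.5*N(-theta, 1), theta unknown."""
    theta = theta0
    for _ in range(n_iters):
        # E-step: the posterior weight of the +theta component is
        # sigmoid(2 * theta * x). M-step: the maximizer of the expected
        # log-likelihood is then the sample mean of x * tanh(theta * x).
        theta = sum(x * math.tanh(theta * x) for x in xs) / len(xs)
    return theta


rng = random.Random(0)
# Data from a single standard Gaussian: the true model has one component,
# so the fitted two-component model is overspecified.
samples = [rng.gauss(0.0, 1.0) for _ in range(5000)]
theta_hat = em_symmetric_mixture(samples, theta0=1.0)
print(f"fitted offset after EM: {theta_hat:.3f}")
```

In this toy case the fitted offset shrinks from its starting value toward zero, meaning the two components move toward each other and the mixture approaches the single Gaussian that generated the data.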

artificial intelligence, convergence, machine learning, (15 more...)

arXiv.org Machine Learning

2510.27056

Country:

Asia > Middle East > Jordan (0.04)
Oceania > Australia (0.04)
North America > United States > Minnesota > St. Louis County > Duluth (0.04)
(9 more...)

Genre: Research Report > New Finding (0.88)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Russia ramps up pressure on all fronts as Ukraine offers to buy US Patriots

Al Jazeera, Apr-17-2025, 08:59:47 GMT

Ukraine has reported dozens of civilian deaths from Russian attacks over the past week, including three killed in a late-night assault on Wednesday in the southeastern city of Dnipro. A child was among the victims of the drone attack, which came hours before high-stakes meetings in Paris due to take place later on Thursday, during which United States Secretary of State Marco Rubio and special envoy to the Middle East Steve Witkoff are to meet French President Emmanuel Macron and other European officials to discuss the conflict. Ukraine's defence and foreign ministers, as well as President Volodymyr Zelenskyy's chief of staff, are also in the French capital for talks with US and European Union delegations, though Kyiv's delegation has not specified who it plans to meet. But as Moscow's self-imposed 30-day ceasefire on energy infrastructure approached its close, talks to achieve a broader ceasefire have so far shown little sign of progress. Russia has stuck to its hardline positions while accusing Ukraine of violating the energy ceasefire, to which Kyiv never agreed.

russia, ukraine, zelenskyy, (14 more...)

Al Jazeera

Country:

North America > United States (1.00)
Asia > Russia (1.00)
Europe > France (0.89)
(18 more...)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Government > Regional Government > Europe Government > Russia Government (1.00)
Government > Regional Government > Asia Government > Russia Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.35)
Information Technology > Communications > Social Media (0.30)

Neural Combinatorial Optimization for Real-World Routing

Son, Jiwoo, Zhao, Zhikai, Berto, Federico, Hua, Chuanbo, Kwon, Changhyun, Park, Jinkyoo

arXiv.org Artificial Intelligence, Mar-20-2025

Vehicle Routing Problems (VRPs) are a class of NP-hard problems ubiquitous in several real-world logistics scenarios that pose significant challenges for optimization. Neural Combinatorial Optimization (NCO) has emerged as a promising alternative to classical approaches, as it can learn fast heuristics to solve VRPs. However, most NCO research on VRPs focuses on simplified settings that rely on unrealistic data distributions and ignore asymmetric distances and travel durations, which cannot be derived from simple Euclidean distances; this hinders real-world deployment. This work introduces RRNCO (Real Routing NCO) to bridge the gap of NCO between synthetic and real-world VRPs in the critical aspects of both data and modeling. First, we introduce a new, openly available dataset of real-world locations, distance matrices, and duration matrices from 100 cities, with actual routing distances and durations obtained from the Open Source Routing Machine (OSRM). Second, we propose a novel approach that efficiently processes both node and edge features through contextual gating, enabling the construction of more informed node embeddings. Finally, we incorporate an Adaptation Attention Free Module (AAFM) with neural adaptive bias mechanisms that effectively integrates not only distance matrices but also angular relationships between nodes, allowing our model to capture rich structural information. RRNCO achieves state-of-the-art results in real-world VRPs among NCO methods. We make our dataset and code publicly available at https://github.com/ai4co/real-routing-nco.
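To make the point about asymmetric travel durations concrete, the sketch below scores the same three stops visited in two directions. The duration matrix and the helper `route_duration` are hypothetical and are not taken from the RRNCO dataset or code; they only show why a symmetric Euclidean distance cannot represent such data.

```python
def route_duration(route, durations):
    """Sum durations[a][b] over consecutive stops, returning to the start."""
    legs = zip(route, route[1:] + route[:1])
    return sum(durations[a][b] for a, b in legs)


# Hypothetical travel times in minutes between three stops. Entry [a][b] is
# the time from a to b; one-way streets make the matrix asymmetric.
durations = [
    [0, 5, 9],
    [7, 0, 4],
    [3, 8, 0],
]

forward = route_duration([0, 1, 2], durations)   # 0 -> 1 -> 2 -> 0
backward = route_duration([0, 2, 1], durations)  # 0 -> 2 -> 1 -> 0
print(forward, backward)  # prints "12 24"
```

Under a Euclidean metric both directions of a closed tour have the same length, so a model trained only on coordinates has no way to prefer the faster direction here.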

artificial intelligence, machine learning, natural language, (12 more...)

arXiv.org Artificial Intelligence

2503.16159

Country:

Asia > East Asia (0.05)
Europe > Northern Europe (0.05)
Asia > Southeast Asia (0.05)
(80 more...)

Genre: Research Report (0.70)

Industry: Transportation > Freight & Logistics Services (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh

Laiyk, Nurkhan, Orel, Daniil, Joshi, Rituraj, Goloburda, Maiya, Wang, Yuxia, Nakov, Preslav, Koto, Fajri

arXiv.org Artificial Intelligence, Feb-19-2025

Instruction tuning in low-resource languages remains underexplored due to limited text data, particularly in government and cultural domains. To address this, we introduce and open-source a large-scale (10,600 samples) instruction-following (IFT) dataset, covering key institutional and cultural knowledge relevant to Kazakhstan. Our dataset enhances LLMs' understanding of procedural, legal, and structural governance topics. We employ LLM-assisted data generation, comparing open-weight and closed-weight models for dataset construction, and select GPT-4o as the backbone. Each entity of our dataset undergoes full manual verification to ensure high quality. We also show that fine-tuning Qwen, Falcon, and Gemma on our dataset leads to consistent performance improvements in both multiple-choice and generative tasks, demonstrating the potential of LLM-assisted instruction tuning for low-resource languages.

dataset, instruction, kazakhstan, (15 more...)

arXiv.org Artificial Intelligence

2502.13647

Country:

North America > United States (0.14)
Asia > Russia (0.14)
Asia > Kazakhstan > Akmola Region > Astana (0.04)
(18 more...)

Genre:

Research Report (1.00)
Personal (1.00)

Industry:

Law (1.00)
Health & Medicine (1.00)
Banking & Finance (0.93)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

LLM Modules: Knowledge Transfer from a Large to a Small Model using Enhanced Cross-Attention

Kolomeitsev, Konstantin

arXiv.org Artificial Intelligence, Feb-12-2025

Large language models (LLMs) have demonstrated outstanding performance in natural language processing tasks; however, their training and deployment require significant computational resources. This has led to the need for methods that transfer knowledge from large pre-trained models to smaller models. Such approaches are especially relevant for applied tasks with limited computational resources. In this work, we propose a modular LLM architecture in which a large model serves as a knowledge source, while a smaller model receives external representations via Enhanced Cross-Attention and generates responses. This method significantly reduces training costs while remaining effective for solving specific business tasks.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2502.08213

Country:

Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.05)
Asia > Kazakhstan > Almaty Region > Almaty (0.05)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Machine Learning Co-pilot for Screening of Organic Molecular Additives for Perovskite Solar Cells

Pu, Yang, Dai, Zhiyuan, Zhou, Yifan, Jia, Ning, Wang, Hongyue, Mukhametkarimov, Yerzhan, Chen, Ruihao, Wang, Hongqiang, Liu, Zhe

arXiv.org Artificial Intelligence, Dec-18-2024

Machine learning (ML) has been extensively employed in planar perovskite photovoltaics to screen effective organic molecular additives, but it encounters predictive biases for novel materials due to small datasets and reliance on predefined descriptors. The present work therefore proposes the Co-Pilot for Perovskite Additive Screener (Co-PAS), an ML-driven framework designed to accelerate additive screening for perovskite solar cells (PSCs). Co-PAS overcomes predictive biases by integrating the Molecular Scaffold Classifier (MSC) for scaffold-based pre-screening and utilizing Junction Tree Variational Autoencoder (JTVAE) latent vectors to enhance molecular structure representation, thereby improving the accuracy of power conversion efficiency (PCE) predictions. Leveraging Co-PAS, we integrate domain knowledge to screen an extensive dataset of 250,000 molecules from PubChem, prioritizing candidates based on predicted PCE values and key molecular properties such as donor number, dipole moment, and hydrogen bond acceptor count. This workflow leads to the identification of several promising passivating molecules, including the novel Boc-L-threonine N-hydroxysuccinimide ester (BTN), which, to our knowledge, has not been explored as an additive in PSCs and achieves a device PCE of 25.20%. Our results underscore the potential of Co-PAS in advancing additive discovery for high-performance PSCs.

artificial intelligence, machine learning, molecule, (18 more...)

arXiv.org Artificial Intelligence

2412.14109

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > Kazakhstan > Almaty Region > Almaty (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Materials > Chemicals (0.93)
Energy > Renewable > Solar (0.91)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Development of a Service Robot for Hospital Environments in Rehabilitation Medicine with LiDAR Based Simultaneous Localization and Mapping

Ibrayev, Sayat, Ibrayeva, Arman, Amanov, Bekzat, Tolenov, Serik

arXiv.org Artificial Intelligence, Nov-7-2024

This paper presents the development and evaluation of a medical service robot equipped with 3D LiDAR and advanced localization capabilities for use in hospital environments. The robot employs LiDAR-based Simultaneous Localization and Mapping (SLAM) to navigate autonomously and interact effectively within complex and dynamic healthcare settings. A comparative analysis with established 3D SLAM technology in Autoware version 1.14.0, under a Linux ROS framework, provided a benchmark for evaluating our system's performance. The adaptation of Normal Distribution Transform (NDT) Matching to indoor navigation allowed for precise real-time mapping and enhanced obstacle avoidance capabilities. Empirical validation was conducted through manual maneuvers in various environments, supplemented by ROS simulations to test the system's response to simulated challenges. The findings demonstrate that the robot's integration of 3D LiDAR and NDT Matching significantly improves navigation accuracy and operational reliability in a healthcare context. This study highlights the robot's ability to perform essential tasks with high efficiency and identifies potential areas for further improvement, particularly in sensor performance under diverse environmental conditions. The successful deployment of this technology in a hospital setting illustrates its potential to support medical staff and contribute to patient care, suggesting a promising direction for future research and development in healthcare robotics.

medical service robot, robot, service robot, (16 more...)

arXiv.org Artificial Intelligence

2411.04797

Country:

Asia > Singapore (0.04)
Europe > Switzerland (0.04)
Europe > Finland (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Health Care Providers & Services > Nursing (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.91)
Information Technology > Artificial Intelligence > Robots > Robots in the Home (0.88)

Dark energy reconstruction analysis with artificial neural networks: Application on simulated Supernova Ia data from Rubin Observatory

Mitra, Ayan, Gómez-Vargas, Isidro, Zarikas, Vasilios

arXiv.org Artificial Intelligence, Oct-30-2024

In this paper, we present an analysis of Supernova Ia (SNIa) distance moduli $\mu(z)$ and dark energy using an Artificial Neural Network (ANN) reconstruction based on LSST simulated three-year SNIa data. The ANNs employed in this study utilize genetic algorithms for hyperparameter tuning and Monte Carlo Dropout for predictions. Our ANN reconstruction architecture is capable of modeling both the distance moduli and their associated statistical errors given redshift values. We compare the performance of the ANN-based reconstruction with two theoretical dark energy models: $\Lambda$CDM and Chevallier-Linder-Polarski (CPL). Bayesian analysis is conducted for these theoretical models using the LSST simulations and compared with observations from Pantheon and Pantheon+ SNIa real data. We demonstrate that our model-independent ANN reconstruction is consistent with both theoretical models. Performance metrics and statistical tests reveal that the ANN produces distance modulus estimates that align well with the LSST dataset and exhibit only minor discrepancies with $\Lambda$CDM and CPL.

astrophy, neural network, reconstruction, (12 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.dark.2024.101706

2402.18124

Country:

North America > United States > Illinois > Champaign County > Urbana (0.14)
South America > Chile (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(6 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Boosting K-means for Big Data by Fusing Data Streaming with Global Optimization

Mussabayev, Ravil, Mussabayev, Rustam

arXiv.org Artificial Intelligence, Oct-18-2024

K-means clustering is a cornerstone of data mining, but its efficiency deteriorates when confronted with massive datasets. To address this limitation, we propose a novel heuristic algorithm that leverages the Variable Neighborhood Search (VNS) metaheuristic to optimize K-means clustering for big data. Our approach is based on the sequential optimization of the partial objective function landscapes obtained by restricting the Minimum Sum-of-Squares Clustering (MSSC) formulation to random samples from the original big dataset. Within each landscape, systematically expanding neighborhoods of the currently best (incumbent) solution are explored by reinitializing all degenerate and a varying number of additional centroids. Extensive and rigorous experimentation on a large number of real-world datasets reveals that by transforming the traditional local search into a global one, our algorithm significantly enhances the accuracy and efficiency of K-means clustering in big data environments, becoming the new state of the art in the field.
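Two ingredients named in this abstract, the MSSC objective evaluated on a sample and the reinitialization of degenerate centroids, can be sketched as follows. This is only an illustrative fragment and not the authors' algorithm: it omits the K-means iterations and the VNS neighborhood schedule, and the helper names (`assign`, `mssc`, `reinit_degenerate`), the toy sample, and the choice to reinitialize onto a random sample point are assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Return the index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda k: sq_dist(pt, centroids[k]))
        for pt in points
    ]


def mssc(points, centroids):
    """MSSC objective: total squared distance to the nearest centroid."""
    labels = assign(points, centroids)
    return sum(sq_dist(pt, centroids[k]) for pt, k in zip(points, labels))


def reinit_degenerate(points, centroids, rng):
    """Move every centroid with no assigned points onto a random point."""
    used = set(assign(points, centroids))
    return [
        c if k in used else list(rng.choice(points))
        for k, c in enumerate(centroids)
    ]


rng = random.Random(0)
# Hypothetical random sample drawn from a larger dataset.
sample = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
# The second centroid is far from all points, so no point is assigned to it.
centroids = [[0.05, 0.0], [100.0, 100.0]]

before = mssc(sample, centroids)
centroids = reinit_degenerate(sample, centroids, rng)
after = mssc(sample, centroids)
print(before, after)
```

Because the degenerate centroid contributed nothing to the objective, moving it onto any sample point can only lower the value, which is why reinitializing such centroids is a cheap way to escape a poor local solution.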

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2410.14548

Country:

Asia > Kazakhstan > Almaty Region > Almaty (0.04)
North America > United States (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

FedPT: Federated Proxy-Tuning of Large Language Models on Resource-Constrained Edge Devices

Gao, Zhidong, Zhang, Yu, Zhang, Zhenxiao, Gong, Yanmin, Guo, Yuanxiong

arXiv.org Artificial Intelligence, Sep-30-2024

Despite demonstrating superior performance across a variety of linguistic tasks, pre-trained large language models (LMs) often require fine-tuning on specific datasets to effectively address different downstream tasks. However, fine-tuning these LMs for downstream tasks necessitates collecting data from individuals, which raises significant privacy concerns. Federated learning (FL) has emerged as the de facto solution, enabling collaborative model training without sharing raw data. While promising, federated fine-tuning of large LMs faces significant challenges, including restricted access to model parameters and high computation, communication, and memory overhead. To address these challenges, this paper introduces Federated Proxy-Tuning (FedPT), a novel framework for federated fine-tuning of black-box large LMs, requiring access only to their predictions over the output vocabulary instead of their parameters. Specifically, devices in FedPT first collaboratively tune a smaller LM, and then the server combines the knowledge learned by the tuned small LM with the knowledge learned by the larger pre-trained LM to construct a large proxy-tuned LM that can reach the performance of directly tuned large LMs. The experimental results demonstrate that FedPT can significantly reduce computation, communication, and memory overhead while maintaining competitive performance compared to directly federated fine-tuning of large LMs. FedPT offers a promising solution for efficient, privacy-preserving fine-tuning of large LMs on resource-constrained devices, broadening the accessibility and applicability of state-of-the-art large LMs.
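The abstract says only that the server combines the predictions of the tuned small LM with those of the large pre-trained LM. One common way to do this in proxy-tuning is a logit offset: add the difference between the tuned and untuned small model's logits to the large model's logits. The sketch below assumes that form; whether FedPT uses exactly this rule is not stated in the abstract, and the function names and logit values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_tuned_probs(large_base, small_tuned, small_base):
    """Shift the large model's logits by the small model's tuning offset."""
    combined = [
        lb + (st - sb)
        for lb, st, sb in zip(large_base, small_tuned, small_base)
    ]
    return softmax(combined)


# Hypothetical next-token logits over a three-token vocabulary.
large_base = [2.0, 1.0, 0.0]
small_base = [1.0, 1.0, 1.0]
small_tuned = [1.0, 3.0, 1.0]  # tuning raised the score of token 1

probs = proxy_tuned_probs(large_base, small_tuned, small_base)
print(probs.index(max(probs)))  # prints 1
```

Only output logits are needed from the large model, which matches the black-box access described above; if the small model is unchanged by tuning, the offset is zero and the large model's distribution is returned as-is.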

dataset, fedavg, fedpt, (13 more...)

arXiv.org Artificial Intelligence

2410.00362

Country:

Asia > Japan (0.14)
North America > United States > Texas (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.48)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
